Incorporating Prosodic with Acoustic information for ISCSLP’2006 Speaker Recognition Evaluation- Robust Cross-Channel Speaker Verification

Authors

  • Wen-Chieh Chang
  • Ding-Yun Chen
  • Zi-He Chen
  • Zhi-Ren Zeng
  • Yuan-Fu Liao
  • Yau-Tarng Juang
Abstract

In this paper, we present our speaker verification (SV) systems for the cross-channel text-independent and text-dependent speaker verification (TI-SV and TD-SV) tasks of the ISCSLP’2006 speaker recognition evaluation (ISCSLP2006-SRE). To address the cross-channel issues and take advantage of the tonal nature of Mandarin, prosodic contours are modeled to assist the state-of-the-art spectral feature-based SV systems. Specifically, two approaches are proposed: (1) latent prosody analysis (LPA) for modeling the prosodic behaviors of a speaker and (2) a Gaussian mixture model (GMM) for modeling the dynamics of the pitch and energy contours. Experimental results on the evaluation set of ISCSLP2006-SRE demonstrate that combining the prosodic feature-based SV systems with the spectral feature-based SV systems outperforms the spectral-feature-only SV systems on both the TI-SV and TD-SV tasks.
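
As a rough illustration of the second approach and of combining prosodic with spectral scores, the sketch below scores pitch/energy dynamics with a diagonal-covariance GMM and fuses the result with a spectral score. The abstract does not specify the feature definition, the mixture sizes, the use of a background model, or the fusion rule, so the first-order differences, the log-likelihood ratio against a background GMM, the linear fusion weight `alpha`, and all function names here are assumptions made for illustration only.

```python
import math

def deltas(contour):
    """First-order differences of a contour (assumed stand-in for 'dynamics')."""
    return [b - a for a, b in zip(contour, contour[1:])]

def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for w, mu, var in zip(weights, means, variances):
            ll = math.log(w)
            for xi, m, v in zip(x, mu, var):
                ll += -0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
            comp_logs.append(ll)
        # log-sum-exp over mixture components for numerical stability
        peak = max(comp_logs)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp_logs))
    return total / len(frames)

def llr(frames, speaker_gmm, background_gmm):
    """Log-likelihood ratio: claimed-speaker GMM vs. background GMM (assumed setup)."""
    return gmm_avg_loglik(frames, *speaker_gmm) - gmm_avg_loglik(frames, *background_gmm)

def fuse(spectral_score, prosodic_score, alpha=0.8):
    """Linear score-level fusion; alpha is a hypothetical weight, not from the paper."""
    return alpha * spectral_score + (1 - alpha) * prosodic_score

# Toy usage: made-up pitch (Hz) and log-energy contours for one utterance.
pitch = [120.0, 124.0, 130.0, 127.0, 121.0]
energy = [5.0, 5.4, 5.9, 5.5, 5.1]
frames = list(zip(deltas(pitch), deltas(energy)))

# Each GMM is (weights, means, variances); parameters are invented for the demo.
speaker_gmm = ([0.5, 0.5], [[4.0, 0.4], [-4.0, -0.4]], [[4.0, 0.04], [4.0, 0.04]])
background_gmm = ([1.0], [[0.0, 0.0]], [[25.0, 0.25]])

prosodic_score = llr(frames, speaker_gmm, background_gmm)
fused_score = fuse(spectral_score=1.2, prosodic_score=prosodic_score)
print(fused_score)
```

In a real system the fused score would be compared against a decision threshold tuned on development data, and the two score streams would typically be normalized to comparable ranges before fusion.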

Related papers

Speaker Verification Using Complementary Information from Vocal Source and Vocal Tract

This paper describes a speaker verification system which uses two complementary acoustic features: Mel-frequency cepstral coefficients (MFCC) and wavelet octave coefficients of residues (WOCOR). While MFCC characterizes mainly the spectral envelope, or the formant structure of the vocal tract system, WOCOR aims at representing the spectro-temporal characteristics of the vocal source excitation....

ISCSLP SR Evaluation, UVA-CS_es System Description. A System Based on ANNs

This paper describes the system used in the ISCSLP06 Speaker Recognition Evaluation for the text-independent cross-channel speaker verification task. It is a discriminative Artificial Neural Network-based system that uses the Non-Target Incremental Learning method to select world representatives. Two different training strategies have been followed: (i) to use world representative samples wit...

A Review of Various Score Normalization Techniques for Speaker Identification System

This paper presents an overview of a state-of-the-art text-independent speaker verification system using score normalization. First, an introduction proposes a modular scheme of the training and test phases of a speaker verification system. Then, the speech parameterization most commonly used in speaker verification, namely cepstral analysis, is detailed. Normalization of scores is then explai...

Duration and pronunciation conditioned lexical modeling for speaker verification

We propose a method to improve speaker recognition lexical model performance using acoustic-prosodic information. More specifically, the lexical model is trained using duration- and pronunciation-conditioned word N-grams, simultaneously modeling lexical information along with their acoustic and prosodic characteristics. Support vector machines are used for modeling and scoring, with N-gram freque...

Publication date: 2006